Learning spectro-temporal representations of complex sounds with parameterized neural networks

Abstract

Deep learning models have become potential candidates for auditory neuroscience research, thanks to their recent successes on a variety of auditory tasks. Yet, these models often lack the interpretability needed to fully understand the exact computations that have been performed. Here, we proposed a parametrized neural network layer that computes specific spectro-temporal modulations based on Gabor kernels (Learnable STRFs) and that is fully interpretable. We evaluated the predictive capabilities of this layer on Speech Activity Detection, Speaker Verification, Urban Sound Classification, and Zebra Finch Call Type Classification. We found that models based on Learnable STRFs are on par with the different toplines on all tasks, and obtain the best performance for Speech Activity Detection. As this layer is fully interpretable, we used quantitative measures to describe the distribution of the learned spectro-temporal modulations. The filters adapted to each task and focused mostly on low temporal and spectral modulations. The analyses show that the filters learned on human speech have similar spectro-temporal parameters as the ones measured directly in the cortex. Finally, we observed that the tasks organized in a meaningful way: the human vocalization tasks were closer to each other, and bird vocalizations were far away from human vocalizations and urban sounds.
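
To make the idea of a Gabor-based spectro-temporal filter concrete, the following is a minimal sketch of a single 2D Gabor kernel over a frequency-by-time grid, written with the Python standard library only. It is not the authors' parametrization: the function name `gabor_kernel` and the parameters `sigma_f`, `sigma_t`, `omega_f`, and `omega_t` are assumptions chosen for illustration. In a learnable layer, parameters of this kind would be trained by gradient descent rather than fixed by hand.

```python
import cmath
import math


def gabor_kernel(n_freq, n_time, sigma_f, sigma_t, omega_f, omega_t):
    """Build a complex 2D Gabor kernel as a list of rows (frequency x time).

    Hypothetical parameters, for illustration only:
      sigma_f, sigma_t: widths of the Gaussian envelope along each axis (in bins)
      omega_f, omega_t: spectral and temporal modulation frequencies (radians per bin)
    """
    kernel = []
    for i in range(n_freq):
        f = i - (n_freq - 1) / 2.0  # centre the frequency axis on zero
        row = []
        for j in range(n_time):
            t = j - (n_time - 1) / 2.0  # centre the time axis on zero
            # Gaussian envelope localises the filter in time and frequency.
            envelope = math.exp(
                -(f * f) / (2 * sigma_f ** 2) - (t * t) / (2 * sigma_t ** 2)
            )
            # Complex sinusoid selects one spectro-temporal modulation.
            carrier = cmath.exp(1j * (omega_f * f + omega_t * t))
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


kernel = gabor_kernel(n_freq=9, n_time=9, sigma_f=2.0, sigma_t=2.0,
                      omega_f=0.25, omega_t=0.5)
```

Convolving a kernel of this form with a spectrogram would respond most strongly to energy patterns whose spectral and temporal modulation rates match `omega_f` and `omega_t`. Because the kernel is described by only a handful of parameters, each learned filter can be read off directly, which is the kind of interpretability the abstract refers to.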

Similar resources

Learning nonnegative features of spectro-temporal sounds for classification

In this paper we present a method of sound classification which exploits a parts-based representation of spectrotemporal sounds, employing the nonnegative matrix factorization (NMF) [1]. We illustrate a new way of learning nonnegative features using a variant of NMF and show its useful behavior in the task of general sound classification with comparison to independent component analysis (ICA) w...

Learning Anonymized Representations with Adversarial Neural Networks

Statistical methods protecting sensitive information or the identity of the data owner have become critical to ensure privacy of individuals as well as of organizations. This paper investigates anonymization methods based on representation learning and deep neural networks, and motivated by novel information-theoretical bounds. We introduce a novel training objective for simultaneously training ...

Nonnegative features of spectro-temporal sounds for classification

A parts-based representation is a way of understanding object recognition in the brain. The nonnegative matrix factorization (NMF) is an algorithm which is able to learn a parts-based representation by allowing only non-subtractive combinations (Lee and Seung, 1999). In this paper we incorporate a parts-based representation of spectro-temporal sounds into the acoustic feature extraction, which ...

Learning to localise sounds with spiking neural networks

To localise the source of a sound, we use location-specific properties of the signals received at the two ears caused by the asymmetric filtering of the original sound by our head and pinnae, the head-related transfer functions (HRTFs). These HRTFs change throughout an organism’s lifetime, during development for example, and so the required neural circuitry cannot be entirely hardwired. Since H...

Journal

Journal title: Journal of the Acoustical Society of America

Year: 2021

ISSN: 0001-4966, 1520-9024, 1520-8524

DOI: https://doi.org/10.1121/10.0005482